Calibrating a multi-species model to time-averaged species’ catches

In this example we will explore how we can learn about models by fitting size spectrum ecological models to data using the “mizer” R package.

Recall, there are three different kinds of size spectrum models in mizer, of increasing complexity: 1) community model: purely size-based and representative of a single but “average” species across the whole community 2) trait-based model, which disaggregates the size spectrum into differentgroups with different life-histories, through differences in each “species” asymptotic which determines other life-history parameters such as the size at maturity (Hartvig et al. 2011, Andersen & Pedersen, 2010) 3) multispecies model - which has the same equations and parameters as the trait-based model but is parameterised to represent multiple species in a real system, where each species can have many differing species-specific traits (Blanchard et al. 2014).

Here we focus on multispecies size spectrum models. In practice, these models have been parametrised in a few different ways depending on data availability for a system or research questions.

Some studies have focused on many species-specific values, for example where each species have different values of life-history, size-selective feeding trait parameters (e.g. and ), and details of species interactions (Blanchard et al. 2014, Reum et al. 208) to better capture the dynamics of marine food webs.

Others, such as Jacobsen et al. (2014, 2016), have represented variation in only a couple of the most important life history parameters for each species - asymptotic size (which links to other parameters such as maturation size) and recruitment parameters (Rmax, eRepro) to broadly capture fished communities or carry out across ecosystem comparisons.

Once you have parametrised the multispecies model for your system, you may find that species do not coexist or the biomass or catches are very different from your observations. After the model is parameterised and asssessed for basic principles and coexistence (see section x - Gustav?), further calibration to observational data is used to ensure the relative abundance of each species reflects the system, at least during a stable period, which is time-averaged.

The background resource parameters and the recruitment parameters, (maximum recruitment) and (reproductive efficiency) greatly affect the biomasses of species in your system.

The recruitment parameters are highly uncertain and capture density dependent processes in the model that limit the number of offspring that successfully recruit to the smallest size class for each species. In the default mizer package these parameters are used to implement an emergent Beverton-Holt type stock recruitment relationship.

As a starting point, we will estimate these parameters as a means of fitting the modelled species catches to the observed catches. This could similarly be carried out with biomasses. Other model detailed approaches also exist, see the main paper, but this approach has been used to get models in the right “ball-park”, which can them be further evaluated using diagnostics (example X) and fitted to time series data (example XX).

A Simple Protocol for Multispecies Model Calibration

We will adapt the “recipe” for calibration in Jacobsen et al 2014 (see supp. mat.) and Blanchard et al (2014), into the following steps:

  1. Run the model with the chosen species-specific parameters. This will relate some of the missing parameters to ( h and gamma - explain through simple example of how the model works?). (see example that explains Rmax?) could also be automatically calculated based on equilbrium assumptions (Andersen et al. 2016) but by default it is “Inf”, which means there is no density dependence associated with spawner-recruit dynamics. #RF(default is not Inf)

  2. Obtain the time-averaged data (e.g. catches or biomasses for each species) and the time-averaged fishing mortalty inputs (e.g. from stock assessments). Typically this should be over a stable part of the time series for your system.

  3. Start with the chosen parameters for and of the resource spectrum that are obtained from the literature regarding the community size spectrum. These can be very uncertain and sometimes are not available. Calibrate the carrying capacity of the background resource spectrum, , by examining the feeding level, biomass through time, and overall size spectrum.

  4. Calibrate the maximum recruitment, , which will affect the relative biomass of each species (and, combined with the fishing parameters, the catches) by minimising the error between observed and estimated catches (again or biomasses).

  5. Check that the physiological recruitment, , is much higher than the realised recruitment, . This can be done using the getRDD and getRDI functions and calculating the ratio which should be around 100 for a species with = 1500g, but varies with asymptotic size and fishing mortality (Andersen 2019). High RDI/RDD ratio indicates the carrying capacity is controlling the population rather than predation or competition. Larger species often require more of this density dependent control than smaller ones. If RDI/RDD is too high, the efficiency of reproduction (erepro) can be lowered to ensure species do not outcompete others or or over-resilient to fishing. Lowering erepro biologically means higher egg mortality rate or wasteful energy invested into gonads. If RDI/RDD = 1 the species is in the linear part of the stock recruitment relationship (no spawner-recruit density dependence).

  6. Verify the model after the above step by comparing the model with: species biomass or abundance distributions, feeding level, natural mortality, growth, vulnerablity to fishing (fmsy) and catch, diet composition. Many handy functions for plotting these are available here: https://sizespectrum.org/mizer/reference/index.html

  7. The final verification step is to force the model with time-varying fishing mortality to assess whether changes in time series in biomassess and catches capture observed trends. The model will not cpature all of the fluctuations from environmental processes (unless some of these are included), but should match the magnitude and general trend in the data.

Step 0. Run the model with the chosen species-specific parameters. This will relate some of the missing parameters to W_inf. R_max will also be automatically calculated based on equilbrium assumptions (Andersen et al. 2016).

Let’s read in the North Sea model parameters, stored in mizer.

# Initial set up
# knitr::opts_chunk$set(cache = T)

# only need to do the installations once.
#install.packages("remotes")
# library(remotes)
#install_github("sizespectrum/mizerExperimental")
library(mizerExperimental) # for projectToSteady()
library(mizer)
library(tidyverse)
library(plotly)

# loading North Sea data
nsParams <- read.csv("data/nsparams.csv")[,-1]

#If you want to make it less multi-species and more trait-based model
# sparams[,"beta"] <-100
# sparams[,"sigma"] <-1.5
nsParams[,"r_max"] <- Inf
# Let's use the North Sea model parameters, but change the parameters and assumptions so that it's essentially a different uncalibrated model, where species are not coexisting

params_uncalibrated <- newMultispeciesParams(nsParams,inter,kappa = 1e11,max_w=1e6) # inter comes with loading "mizer"
## Note: No h provided for some species, so using f0 and k_vb to calculate it.
## Note: Because you have n != p, the default value is not very good.
## Note: No ks column so calculating from critical feeding level.
## Note: Using z0 = z0pre * w_inf ^ z0exp for missing z0 values.
## Note: Using f0, h, lambda, kappa and the predation kernel to calculate gamma.
# note the volume of this model is set to the reflect the entire volume of the North Sea - hence the very large kappa value. This is system specific and you may wnat to work with per m^3 as in the defaults.

#  Add other params for info
#  param$Volumecubicmetres=5.5e13    #unit of volume. Here total volume of North sea is used (Andersen & Ursin 1977)

# have a look at species parameters that have been calculated
# params@species_params
# alternative params without redundant parameters to reduce the size of the dataframe on the screeen
params_uncalibrated@species_params[,-which(colnames(params_uncalibrated@species_params) %in% c("sel_func","gear","interaction_resource","pred_kernel_type","m"))]
##         X1 species   w_inf w_mat   beta sigma R_max  k_vb  l25  l50     a     b
## Sprat    1   Sprat    33.0    13  51076   0.8   Inf 0.681  7.6  8.1 0.007 3.014
## Sandeel  2 Sandeel    36.0     4 398849   1.9   Inf 1.000  9.8 11.8 0.001 3.320
## N.pout   3  N.pout   100.0    23     22   1.5   Inf 0.849  8.7 12.2 0.009 2.941
## Herring  4 Herring   334.0    99 280540   3.2   Inf 0.606 10.1 20.8 0.002 3.429
## Dab      5     Dab   324.0    21    191   1.9   Inf 0.536 11.5 17.0 0.010 2.986
## Whiting  6 Whiting  1192.0    75     22   1.5   Inf 0.323 19.8 29.0 0.006 3.080
## Sole     7    Sole   866.0    78    381   1.9   Inf 0.284 16.4 25.8 0.008 3.019
## Gurnard  8 Gurnard   668.0    39    283   1.8   Inf 0.266 19.8 29.0 0.004 3.198
## Plaice   9  Plaice  2976.0   105    113   1.6   Inf 0.122 11.5 17.0 0.007 3.101
## Haddock 10 Haddock  4316.5   165    558   2.1   Inf 0.271 19.1 24.3 0.005 3.160
## Cod     11     Cod 39851.3  1606     66   1.3   Inf 0.216 13.2 22.9 0.005 3.173
## Saithe  12  Saithe 39658.6  1076     40   1.1   Inf 0.175 35.3 43.6 0.007 3.075
##         catchability w_min alpha         n   p         q        h k       ks
## Sprat     1.29533333 0.001   0.6 0.6666667 0.7 0.7166667 14.46675 0 1.593753
## Sandeel   0.06510547 0.001   0.6 0.6666667 0.7 0.7166667 25.62741 0 2.936414
## N.pout    0.31380000 0.001   0.6 0.6666667 0.7 0.7166667 31.20422 0 3.372902
## Herring   0.18150000 0.001   0.6 0.6666667 0.7 0.7166667 28.36363 0 2.920263
## Dab       0.97800000 0.001   0.6 0.6666667 0.7 0.7166667 34.87720 0 3.781368
## Whiting   0.24266667 0.001   0.6 0.6666667 0.7 0.7166667 31.77220 0 3.301616
## Sole      0.37383333 0.001   0.6 0.6666667 0.7 0.7166667 24.73805 0 2.567302
## Gurnard   0.46250569 0.001   0.6 0.6666667 0.7 0.7166667 20.64990 0 2.193126
## Plaice    0.18483333 0.001   0.6 0.6666667 0.7 0.7166667 16.94072 0 1.740765
## Haddock   0.30150000 0.001   0.6 0.6666667 0.7 0.7166667 41.46028 0 4.196598
## Cod       0.26666749 0.001   0.6 0.6666667 0.7 0.7166667 69.40226 0 6.511735
## Saithe    0.39300000 0.001   0.6 0.6666667 0.7 0.7166667 59.95343 0 5.700788
##                 z0        gamma     w_mat25 erepro
## Sprat   0.18705957 5.652974e-11   11.647460      1
## Sandeel 0.18171206 3.790575e-11    3.583834      1
## N.pout  0.12926608 9.750228e-11   20.607045      1
## Herring 0.08647736 2.514308e-11   88.699888      1
## Dab     0.08735805 7.579184e-11   18.815128      1
## Whiting 0.05658819 9.927702e-11   67.196884      1
## Sole    0.06294752 5.184323e-11   69.884760      1
## Gurnard 0.06863713 4.638552e-11   34.942380      1
## Plaice  0.04171321 4.489177e-11   94.075638      1
## Haddock 0.03685027 7.707429e-11  147.833146      1
## Cod     0.01756590 2.325827e-10 1438.909287      1
## Saithe  0.01759431 2.435635e-10  964.051303      1
#lets' change the plotting colours
library(viridis)
## Loading required package: viridisLite
params_uncalibrated@linecolour[1:12] <-plasma(12)
params_uncalibrated@linecolour["Resource"] <-"seagreen3"

# run with fishing
sim_uncalibrated <- project(params_uncalibrated, t_max = 100, effort = 1)

plot(sim_uncalibrated)

Oh dear, all of the species but three have collapsed! This is because there was no density dependence (Rmax default is set at ‘Inf’) and the largest species (cod and saithe) has outcompeted all of the rest.

Step 1. Obtain the time-averaged data (e.g. catches or biomasses for each species) and the time-averaged fishing mortalty inputs (e.g. from stock assessments). Typically this should be over a stable part of the time series for your system.

The following .csv are extracted from the ICES database using “data/getICESFishdata_param.R”. Fishing data is averaged over 2014-2019 as it’s a relatively stable period in catches.

# fisheries mortality F
fMat <- read.csv("data/fmat.csv")
fMatWeighted <- read.csv("data/fmatWeighted.csv") # Sandeel and Cod have multiple data base so average their F and weighting by SSB

# read in time-averaged  catches  
catchAvg <-read.csv("data/time-averaged-catches.csv") # only that one is used at the moment | catches are estimated from fMatW

# ssb
ssbAvg <- read.csv("data/time-averaged-SSB.csv")

Step 2. Calibrate the carrying capacity of the background resource spectrum, kappa, at steady state

# the fishing mortality rates are already stored in the param object as
params_uncalibrated@species_params$catchability
##  [1] 1.29533333 0.06510547 0.31380000 0.18150000 0.97800000 0.24266667
##  [7] 0.37383333 0.46250569 0.18483333 0.30150000 0.26666749 0.39300000
# let's start again and replace with the initial pre-calibration "guessed" Rmax 
params_guessed <- params_uncalibrated
# penalise the large species with higher density dependence
params_guessed@species_params$R_max <- params_guessed@resource_params$kappa*params_guessed@species_params$w_inf^-1
# and reduce erepro
params_guessed@species_params$erepro <- 1e-3

params_guessed <- setParams(params_guessed)
# run with fishing
sim_guessed <- project(params_guessed, t_max = 100, effort =1)
plot(sim_guessed)

Here, Sprat’s biomass is orders of magnitude lower than the other species and so are Saithe’s largest individuals

## compare with Gustav's projectToSteady
params_steady <- projectToSteady(params_guessed, t_max = 100, return_sim = F)
## Convergence was achieved in 61.5 years.
sim_steady <- project(params_steady, t_max = 300, effort =1)
plot(sim_steady)

plotSpectra(sim_steady,power=2,total = T)

Species are coexisting (but Sprat’s still low). This is in part because we applied a stronger Rmax effect for larger species. You can play with the above parameters but but it would take a lot of trial an error to achieve the right combination to get the biomass or catches similar to the observations.

We could explore the effects further using Rshiny app, where we also have a plot of the biomass or catch data. First let’s look at the basic diagnostics and tune kappa and erepro to make sure the feeding levels are high enough for each species and that biomasses coexist.

# Optional
# adjust Rmax and/or reproductive efficiency to examine whether improved steady state is achievable that way
library(shiny) # no need if runtime = shiny is in the YAML
# runApp("shiny-equilibrium")
# is there a way to save the final chosen values?
params_shiny <- params_guessed

shinyApp(

  ui=fluidPage(
  
  # Application title
  titlePanel("North Sea Model Example"),
  
  fluidRow(
    column(4, wellPanel(
       sliderInput("kappa", "log10 Resource Carrying Capacity:", min = 8, max = 12, value = 10.7,
                   step = 0.1),
    #   sliderInput("Rmax", "log10 Maximum Recruitment:", min = 1, max = 12, value = 12,
    #              step = 0.1),
       sliderInput("erepro", "log10 Reproductive Efficiency:", min = -8, max = 1, value = -3,
                   step = 0.1)
          )),
    column(6,
           plotOutput("distPlot", width = 600, height = 600)
    ))
     
  
    
  ),
  
  server = function(input, output) {
   
  output$distPlot <- renderPlot({
    # set up params using values given, need check and change parameter values so units work in days units 
    params_shiny@species_params$erepro <- rep(10^input$erepro,12)
   # params@species_params$Rmax <- rep(10^input$Rmax,12)
    params_shiny <- setParams(params_shiny,kappa=10^input$kappa)
    # run without fishing
    sim_shiny <- project(params_shiny, effort = 1, t_max = 100)
    plot(sim_shiny)
     })

},

  
  


  options = list(height = 500)
)

This improves matters a little, but we need to make some species-specific adjustments.

The shiny app helps with understanding the model but it is tricky to arrive at the best fit especially if we want to change several species parameter combinations at a time.

Let’s choose some values that enable the most species to coexist as a starting point for optimisation. Note we won’t vary erepro at the same time as Rmax (they depend on each other). However we will use the value of erepro selected from the shiny app.

Varying kappa will affect the species’ growth curve, we need to check the modelled growth curves agaisnt the “observed” von Bertlanffy growth curves in case varying kappa divert too much the emergent growth curves.

plotGrowthCurvesPannel <- function(object, max_age = 20, log_scale = F)
{
            gc <- getGrowthCurves(object, max_age = max_age)[, , drop = FALSE] %>%
                cubelyr::as.tbl_cube(met_name = "Size") %>%
                as_tibble() %>%
                mutate(Legend = "Model",
                       Species = factor(Species, object@species_params$species))

            vb <- gc %>%
                mutate(Legend = "von Bertalanffy") %>%
                mutate(a = object@species_params[Species, "a"],
                       b = object@species_params[Species, "b"],
                       k_vb = object@species_params[Species, "k_vb"],
                       t0 = 0) %>% # since it's not in the param object
                filter(!is.na(k_vb)) %>%
                mutate(L_inf = (object@species_params[Species, "w_inf"] / a)^(1 / b),
                       Size = a * (L_inf * (1 - exp(-k_vb * (Age - t0))))^b) %>%
                select(names(gc))
            
           p <- ggplot(bind_rows(gc, vb)) +
                geom_line(aes(x = Age, y = Size, colour = Legend)) +
                scale_x_continuous(name = "Age [years]") +
                geom_hline(aes(yintercept = w_mat),
                           data = tibble(Species = object@species_params$species[],
                                         w_mat = object@species_params$w_mat[]),
                           linetype = "dashed",
                           colour = "grey") +
                geom_hline(aes(yintercept = w_inf),
                           data = tibble(Species = object@species_params$species[],
                                         w_inf = object@species_params$w_inf[]),
                           linetype = "solid",
                           colour = "grey") +
                facet_wrap(~Species, scales = "free_y")
           
 if(log_scale) p + scale_y_continuous(trans = "log10") else p 



}

# All emergent and observed growth curves as pannels
plotGrowthCurvesPannel(params_guessed)

# All emergent growth curves together
plotGrowthCurves(params_guessed, percentage = T)

# One by one observed vs emergent
plotGrowthCurves(params_guessed, species = "Cod")

#  Ken - Explain that the carrying capacity of the resource spectrum adjusts growth …  and that it is set by Gustavs “default” routines. ?

Step 3. Calibrate the maximum recruitment

will affect the relative biomass of each species (and, combined with the fishing parameters, the catches) by minimising the error between observed and estimated catches or biomasses. We could also include kappa in our estimation here (as in Blanchard et al 2104 & Spence et al 2016) but instead we will use the value that seemed OK in terms of feeding levels in the shiny app, roughly log10(11.5). Same goes for erepro, a value of 1e-3 seemed ok.

First let’s set up a function running the model and outputing the difference between predicted catches (getYield()) and actual catches (cdat). err is the sum of squared errors between the two.

# # change kappa and erepro based on shiny exploration, set up initial values based on "close to" equilibrium values from above sim
#RF steady params set up to erepro = 0.001 and kappa = 10^11

  # steadyparams@species_params$erepro[] <-1e-0
  # params@initial_effort<-1
  # params@initial_n<-sim@n[100,,]
  # params@initial_n_pp<-sim@n_pp[100,]
  # params <- setParams(params,kappa=10^(11))

# define the initial parameters to send to optimisation code below

# we need 12 Rmaxs, log10 scale
vary <- log10(params_steady@species_params$R_max)
#vary<-runif(10,3,12) # or use completley made up values, same for each species test for effects of initial values

## set up the enviornment to keep the current state of the simulations 
state <- new.env(parent = emptyenv())
state$params <-  params_steady

catchAvg <-read.csv("data/time-averaged-catches.csv") # only that one is used at the moment | catches are estimated from fMatW


## the following getError function combines the steps of the optimisastion above - this time with the multispecies model and output the predicted size spectrum

## update below with project_steady and saving the state from each iteration
#RF the function takes a bunch of RMax and compare the theoretical catches versus data
getError <- function(vary,params,dat,env=state,data_type="catch", tol = 0.1,timetorun=10) {
  
  #env$params@species_params$R_max[]<-10^vary[1:12]
  params@species_params$R_max[]<-10^vary[1:12]
  
  params <- setParams(params)
  # run to steady state and update params
  # env$params<- projectToSteady(env$params, distance_func = distanceSSLogN,
  #                 tol = tol, t_max = 200,return_sim = F)
  params<- projectToSteady(params, distance_func = distanceSSLogN,
                   tol = tol, t_max = 200,return_sim = F)
 
  # create sim object 
   
  sim <- project(params, effort = 1, t_max = timetorun) #Change t_max to determine how many years the model runs for
  
  # 
  # sim <- project(env$params, effort = 1, t_max = timetorun) #Change t_max to determine how many years the model runs for
  # 
  # env$params <-sim@params
  # 
  
          ## what kind of data and output do we have?
          if (data_type=="SSB") {
          output <-getSSB(sim)[timetorun,]   #could change to getBiomass if using survey, also check units.
          }
  
          if (data_type=="catch") {
         output <-getYield(sim)[timetorun,]/1e6 
         #' using n . w . dw so g per year per volume (i.e. North Sea since kappa is set up this way). 
         #'The data are in tonnes per year so converting to tonnes.
          }
  
  pred <- log(output)
  dat  <- log(dat)

  # sum of squared errors, here on log-scale of predictions and data (could change this or use other error or likelihood options)
   discrep <- pred - dat

   discrep <- (sum(discrep^2))
  
  # can use a strong penalty on the error to ensure we reach a minimum of 10% of the data (biomass or catch) for each species
  # if(any(pred < 0.1*dat)) discrep <- discrep + 1e10
  
    return(discrep)

   }

## test it
err<-getError(vary = vary, params = params_steady, dat = catchAvg$Catch_1419_tonnes)
## Convergence was achieved in 24 years.
# err<-getError(vary,params,dat=rep(100,12),data_type="biomass")
err
## [1] 85.82931

Now, carry out the optimisation. There are several optimastion methods to choose from - we need to select the most robust one to share here. The R package optimParallel seems to be the most robsut general R package and has replaced optim. Often this requires repeateing the proceure several times but the advantage of using parallel run is the speed compared to packages such as optimx.

This might take AWHILE. Go watch some Netflix. #RF De*th to Corpos, use PopcornTime instead

The output is saved as “optim_para_result” if you wish to skip this block

# read in a set of optimized params
# params<- readRDS("optim_param.RDS")
# sim <- readRDS("optim_sim.RDS")

#optim_params@resource_params$kappa <-10^5.6
#params<-setParams(steadyparams)

library("parallel")
library("optimParallel")
library("tictoc")

params_optim <- params_guessed
vary <-  log10(params_optim@species_params$R_max)


params_optim@resource_params$kappa<-3.2e11 # where does this come from?
params_optim<-setParams(params_optim)

noCores <- detectCores() - 1 # keep a spare core

cl <- makeCluster(noCores, setup_timeout = 0.5)
setDefaultCluster(cl = cl)
clusterExport(cl, as.list(ls()))
clusterEvalQ(cl, {
  library(mizerExperimental)
  library(optimParallel)
})
## [[1]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[2]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[3]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[4]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[5]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[6]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[7]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[8]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[9]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[10]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[11]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[12]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[13]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[14]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[15]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[16]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[17]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[18]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[19]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[20]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[21]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[22]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[23]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[24]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[25]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[26]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[27]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[28]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[29]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[30]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[31]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[32]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[33]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[34]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[35]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[36]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[37]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[38]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[39]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[40]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[41]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[42]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[43]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[44]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[45]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[46]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"             
## 
## [[47]]
##  [1] "optimParallel"     "parallel"          "mizerExperimental"
##  [4] "mizer"             "stats"             "graphics"         
##  [7] "grDevices"         "utils"             "datasets"         
## [10] "methods"           "base"
tic()
optim_result <-optimParallel(par=vary,getError,params=params_optim, dat = catchAvg$Catch_1419_tonnes, method   ="L-BFGS-B",lower=c(rep(3,12)),upper= c(rep(15,12)),
                            parallel=list(loginfo=TRUE, forward=TRUE))

stopCluster(cl)
toc() # 80'' using 47 cores
## 81.563 sec elapsed
saveRDS(optim_result,"optim_para_result.RDS")
# if previous block not evaluated | have some issue enabling "runtime:shiny" and loading .csv somehow
params_optim <- params_guessed
params_optim@resource_params$kappa<-3.2e11 # where does this come from?

optim_result <- readRDS("optim_para_result.RDS")
# optim values:
params_optim@species_params$R_max <- 10^optim_result$par 

# set the param object 
params_optim <-setParams(params_optim)
sim_optim <- project(params_optim, effort = 1, t_max = 100, dt=0.1,initial_n = sim_guessed@n[100,,],initial_n_pp = sim_guessed@n_pp[100,])
saveRDS(sim_optim,"optim_para_sim.RDS")
plot(sim_optim)

Step 4. Check the level of density dependence. Is the physiological recruitment, , much higher than the realised recruitment, . High RDI/RDD ratio indicates strong density dependence.

plot_dat <- as.data.frame(getRDI(sim_optim@params)/getRDD(sim_optim@params))
plot_dat$species <- factor(rownames(plot_dat),sim_optim@params@species_params$species)
colnames(plot_dat)[1] <- "ratio"

ggplot(plot_dat) +
  geom_bar(aes(x = species, y = ratio, fill = species),stat="identity") +
  scale_fill_manual(name = "Species", values = sim_optim@params@linecolour) +
  scale_y_continuous(name = "RDI/RDD") +
  scale_x_discrete(name = "Species") +
   theme(panel.background = element_rect(fill = "white", color = "black"),
        legend.position = "none") 

getRDI(sim_optim@params)/getRDD(sim_optim@params)
##      Sprat    Sandeel     N.pout    Herring        Dab    Whiting       Sole 
##   1.000534   2.940145   4.271215   2.510232   6.241042  14.389434  85.672519 
##    Gurnard     Plaice    Haddock        Cod     Saithe 
##  15.051169   1.604304  40.538347 118.077148   2.486730
plot_dat$w_inf <- sim_optim@params@species_params$w_inf

ggplot(plot_dat)+
  geom_point(aes(x = w_inf, y = ratio, color = species)) +
  scale_y_continuous(name = "RDI/RDD") +
  scale_x_continuous(name = "Asymptotic size (g)", trans = "log10") +
  scale_color_manual(name = "Species", values = sim_optim@params@linecolour) +
  theme(panel.background = element_rect(fill = "white", color = "black"),
        legend.justification=c(1,1),legend.key = element_rect(fill = "white"))

# seems like there is little density dependence 

# # if needed change erepro & plug back into model
 # params@species_params$erepro[] <-1e-3
 # params <- setParams(params)
 # sim <- project(params, effort = 1, t_max = 500, dt=0.1)
 # plot(sim)

Step 5. Verify the model after the above step by comparing the model with data.

Eg. species biomass or abundance distrubtions, feeding level, naturality mortality, growth, vulnerablity to fishing (fmsy) and catch, diet composition… Many handy functions for plotting these are available here: https://sizespectrum.org/mizer/reference/index.html

plotPredObsYield <-function(sim,dat){
## check obs vs. predicted yield
#sim<-newsim
pred_yield <-melt(getYield(sim)[100,]/1e6)
pred_yield$obs <- dat
pred_yield$species <-row.names(pred_yield)

p <- ggplot() + # plot predicted and observed yields
        geom_point(data = pred_yield, 
            aes(x = log10(value), y = log10(obs), color = species)) +
   # plot optimal fit line
        geom_abline(color = "black", slope = 1, intercept = 0) + 
  xlab("log10 Predicted Yield") + 
  ylab("log10 Observed Yield") #+ 
 # scale_fill_manual(values = wes_palette(12, "Zissou")) 
return(p)
} 

plotDiet2 <- function (object, species, xlim = c(1,NA)) 
{
    params <- validParams(object)
    if (is.integer(species)) {
        species <- params@species_params$species[species]
    }
    diet <- getDiet(params)[params@species_params$species == 
        species, , ]
    prey <- dimnames(diet)$prey
    prey <- factor(prey, levels = rev(prey))
    plot_dat <- data.frame(Proportion = c(diet), w = params@w, 
        Prey = rep(prey, each = length(params@w)))
    plot_dat <- plot_dat[plot_dat$Proportion > 0, ]
    ggplot(plot_dat) + geom_area(aes(x = w, y = Proportion, fill = Prey)) + 
        scale_x_log10(limits = xlim) + labs(x = "Size [g]") + scale_fill_manual(values = params@linecolour)
}

plotPredObsYield(sim_optim,catchAvg$Catch_1419_tonnes) 

# check other plots
plotlyBiomass(sim_optim)
plotlySpectra(sim_optim)
plotlySpectra(sim_optim,power=2,total = T)
plotlyGrowthCurves(sim_optim,percentage = T) 
plotlyFeedingLevel(sim_optim) 
plotlyPredMort(sim_optim)
plotlyFMort(sim_optim)
# plotGrowthCurves(sim,"Cod") # check each species - not sure if these kvb params are correct/or should not be used in this way (the ones I initially used were maximal values) - ideally we would have data here as growth is actually highly variable.

plotDiet2(sim_optim@params,"Cod") ## perhaps feeding too much on background resource, although here note that is "other food" too, such as benthos.
## Warning: Removed 421 rows containing missing values (position_stack).

# plotDiet(sim_optim@params,"Sprat")

# What would happen if we also parameterised the interaction matrix or beta and sigma?

Step 6. The final verification step is to force the model with time-varying fishing mortality to assess whether changes in time series in biomassess and catches capture observed trends. The model will not capture all of the fluctuations from environmental processes ( unless some of these are included), but should match the magnitude and general trend in the data. We explore this in Example # 2 - Changes through time.